Shout-out to ChatGPT for helping me rewrite this.
https://chat.openai.com/share/adb3e59d-1e70-42d9-a7ae-b2a8f547bd6a
import librosa
import numpy as np
from IPython.display import Audio
import matplotlib.pyplot as plt
import holoviews as hv
hv.extension('bokeh')
file_names = [
    'audio/03-01-01-01-01-02-01.wav',
    'audio/20 - 20,000 Hz Audio Sweep Range of Human Hearing.mp3',
    'audio/videoplayback.mp3',
]
Audio as Time-Series Data
Representation: Audio is stored as time-series data (the audio_data array in the code below) and plotted as a graph with time on the X-axis and magnitude (or amplitude) on the Y-axis.
Information Content: This representation captures all the information in the recorded sound. The fluctuations in amplitude over time represent the sound wave's pressure variations, which correspond to the sound we perceive.
Perception of Sound: It's crucial to distinguish between the physical properties of sound (amplitude, frequency) and how we perceive these properties (loudness, pitch). Amplitude in a time-series graph relates to the loudness of the sound, but not directly to its pitch or quality.
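To make the amplitude/loudness distinction a little more concrete, here is a minimal sketch that measures the level of a synthetic tone in decibels. The 440 Hz tone, the 22,050 Hz sample rate, and the helper name `rms_db` are assumptions chosen for illustration; they are not part of the notebook's audio files. Decibels are a logarithmic scale, which is usually considered closer to how we hear level changes than raw amplitude, though it is still not a full model of perceived loudness.

```python
import numpy as np

sample_rate = 22050  # assumed; matches librosa.load's default rate
t = np.arange(sample_rate) / sample_rate  # one second of timestamps
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)  # hypothetical 440 Hz test tone


def rms_db(signal):
    """Root-mean-square level in dB relative to an amplitude of 1.0."""
    rms = np.sqrt(np.mean(signal ** 2))
    return 20 * np.log10(rms)


full_level = rms_db(tone)
half_level = rms_db(0.5 * tone)  # same tone at half the amplitude

print(f"full amplitude: {full_level:.2f} dB")
print(f"half amplitude: {half_level:.2f} dB")
print(f"difference:     {full_level - half_level:.2f} dB")
```

Halving the amplitude lowers the level by about 6 dB, while the pitch of the tone is unchanged.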
Challenges and Issues
Interpretation vs. Intuition: While the graph provides a direct view of sound amplitude over time, understanding the sound's characteristics from this alone can be non-intuitive.
For example:
- High amplitude followed by a sudden drop could indicate a loud sound abruptly stopping.
- A gradual increase in amplitude might suggest a sound gradually getting louder. However, without information on frequency, it's hard to determine the nature of the sound (e.g., whether it is a musical note rising in pitch).
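The examples above can be sketched with two synthetic tones that share the same rising amplitude envelope but differ in frequency. Everything here (the 220 Hz and 880 Hz tones, the linear envelope, the helper names `peak_per_block` and `dominant_frequency`) is an assumption made for illustration. The coarse amplitude outline that a zoomed-out waveform plot shows is nearly identical for both, while the dominant frequency is not.

```python
import numpy as np

sample_rate = 22050  # assumed; matches librosa.load's default rate
t = np.arange(sample_rate) / sample_rate  # one second of timestamps
envelope = np.linspace(0.0, 1.0, t.size)  # sound gradually getting louder

low_tone = envelope * np.sin(2 * np.pi * 220.0 * t)   # hypothetical low note
high_tone = envelope * np.sin(2 * np.pi * 880.0 * t)  # hypothetical high note


def peak_per_block(signal, block=2205):
    """Largest absolute amplitude in each block: the visible outline."""
    return np.abs(signal).reshape(-1, block).max(axis=1)


def dominant_frequency(signal, sr):
    """Frequency (Hz) of the strongest component in the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sr)
    return freqs[np.argmax(spectrum)]


outline_low = peak_per_block(low_tone)
outline_high = peak_per_block(high_tone)
freq_low = dominant_frequency(low_tone, sample_rate)
freq_high = dominant_frequency(high_tone, sample_rate)

print("outlines match:", np.allclose(outline_low, outline_high, atol=0.05))
print("dominant frequencies (Hz):", freq_low, freq_high)
```

The outline alone cannot tell the two notes apart; the frequency content can.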
Perception of Sound and Amplitude: Amplitude corresponds to the loudness of a sound, but our perception of sound is multidimensional, involving pitch (related to frequency), timbre (related to the complex structure of sound waves), and duration. Simply observing amplitude variations does not provide a complete picture of these aspects.
Time Shift Sensitivity: Time-series data is sensitive to time shifts. A slight shift in the time domain can change the appearance of the waveform, potentially leading to different interpretations, especially in complex sounds or music.
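A small sketch of this sensitivity, under stated assumptions: the signal is a synthetic two-tone mix (440 Hz and 660 Hz, chosen for illustration), and the shift is a circular one made with `np.roll`, which is a simplification of a real delay. Shifting by about a millisecond changes the sample values substantially, yet the magnitude spectrum stays the same.

```python
import numpy as np

sample_rate = 22050  # assumed; matches librosa.load's default rate
t = np.arange(sample_rate) / sample_rate  # one second of timestamps

# Hypothetical test signal: two sine tones mixed together
signal = np.sin(2 * np.pi * 440.0 * t) + 0.5 * np.sin(2 * np.pi * 660.0 * t)

shift = 25  # samples, roughly 1.1 ms at this sample rate
shifted = np.roll(signal, shift)  # circular shift (simplifying assumption)

# Largest sample-by-sample difference between the two waveforms
waveform_difference = np.max(np.abs(signal - shifted))

# Largest difference between the two magnitude spectra
spectrum_difference = np.max(
    np.abs(np.abs(np.fft.rfft(signal)) - np.abs(np.fft.rfft(shifted)))
)

print("max waveform difference:", waveform_difference)
print("max magnitude-spectrum difference:", spectrum_difference)
```

The waveform difference is large compared with the signal's own amplitude, while the spectrum difference is at the level of floating-point rounding. This is one reason frequency-domain views are often easier to compare than raw waveforms.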
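# Plot the waveform and embed an audio player for each file
for file_name in file_names:
    # librosa.load resamples to 22,050 Hz mono by default
    audio_data, sample_rate = librosa.load(file_name)
    # One timestamp per sample. librosa.times_like treats its input as
    # frames (hop_length=512 by default), which would stretch the time axis.
    time = np.arange(len(audio_data)) / sample_rate
    plot = hv.Curve((time, audio_data), 'Time (s)', 'Amplitude').opts(
        width=1100, height=400, title="Waveform: " + file_name)
    display(plot)
    display(Audio(data=audio_data, rate=sample_rate, autoplay=False))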